SOME NEW RESULTS ON GRUBBS' ESTIMATORS


DENNIS  A.  BRINDLEY  AND  RALPH  A.  BRADLEY* 


Consider a two-way classification with $n$ rows and $r$ columns and the
usual model of analysis of variance except that the error components of
the model may have heterogeneous variances $\sigma_j^2$, $j = 1, \ldots, r$,
by columns. Grubbs provided unbiased estimators $Q_j$ of $\sigma_j^2$ that
depend on column sums of squared residuals $S_j$. When $r = 3$, the joint
distributions of the $S_j$ and the $Q_j$ are given for the first time in
closed form.

Two tests proposed by Russell and Bradley are examined when $r = 3$,
one for variance homogeneity and the second for one possible disparate
variance. A very simple distribution is found for the test statistic
of the first test and its non-null distribution is derived also. The
distribution of the second test statistic was known to be the central
variance-ratio distribution in the null case and now its ratio to a
parameter of noncentrality is shown to have that same distribution in
the non-null case.

When $r = 4$, $n = 4$, the joint distribution of the $S_j$ is given also
in closed form, but it is difficult to use. For $r > 3$, an approximate
test of variance homogeneity is proposed, based on an extension of the
Russell-Bradley statistic. Extensive simulation studies show that the
distribution of the test statistic may be approximated very well by a
chi-square distribution.


*  Dennis  A.  Brindley  is  a  research  scientist  with  Bristol-Myers 
Pharmaceutical  Research  and  Development  Division,  Evansville,  Indiana 
47630.  Ralph  A.  Bradley  is  Research  Professor,  Department  of  Statistics 
and  Computer  Science,  University  of  Georgia,  Athens,  Georgia  30602. 


1.  INTRODUCTION 


Consider the two-way classification of observations $y_{ij}$,
$i = 1, \ldots, n$, $j = 1, \ldots, r$, and the model,

$$y_{ij} = \mu_i + \beta_j + \epsilon_{ij}, \tag{1}$$

where $\mu_i$ represents the mean response of row $i$, $\beta_j$ represents
the additional effect of column $j$, $\sum_j \beta_j = 0$, and the
$\epsilon_{ij}$ are independent, zero-mean, normal variates with expectation
$E(\epsilon_{ij}^2) = \sigma_j^2$. Model (1) differs from the usual model in
that variances are column-related. Grubbs (1948) estimated $\sigma_j^2$
using $Q_j$, $j = 1, \ldots, r$, where

$$Q_j = \frac{r}{(n-1)(r-2)}\, S_j - \frac{1}{(n-1)(r-1)(r-2)} \sum_{t=1}^{r} S_t \tag{2}$$

and

$$S_j = \sum_{i=1}^{n} \left(y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot}\right)^2, \tag{3}$$

in which $\bar{y}_{i\cdot}$, $\bar{y}_{\cdot j}$ and $\bar{y}_{\cdot\cdot}$ are
respectively the means of the observations in the $i$-th row, $j$-th column,
and complete table. Note that $E(Q_j) = \sigma_j^2$, but $Q_j$ may be
negative in certain situations.
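The definitions (2) and (3) translate directly into a few lines of code. The following is a minimal Python sketch, not part of the original paper; the function name `grubbs_estimators` and the small data table are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple


def grubbs_estimators(y: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return (S, Q): column sums of squared residuals (3) and Grubbs' estimators (2)."""
    n, r = len(y), len(y[0])
    if n < 2 or r < 3:
        raise ValueError("need n >= 2 rows and r >= 3 columns")
    row_mean = [sum(row) / r for row in y]
    col_mean = [sum(y[i][j] for i in range(n)) / n for j in range(r)]
    grand = sum(row_mean) / n
    # S_j: sum over rows of the squared two-way residuals in column j, as in (3).
    S = [
        sum((y[i][j] - row_mean[i] - col_mean[j] + grand) ** 2 for i in range(n))
        for j in range(r)
    ]
    total = sum(S)
    # Q_j from (2); note that an individual Q_j can come out negative.
    Q = [
        r * S[j] / ((n - 1) * (r - 2)) - total / ((n - 1) * (r - 1) * (r - 2))
        for j in range(r)
    ]
    return S, Q


# Hypothetical 4 x 3 table, made up purely to exercise the function.
table = [
    [10.1, 10.4, 9.8],
    [12.0, 12.9, 11.7],
    [11.2, 11.0, 11.5],
    [9.5, 10.3, 9.1],
]
S, Q = grubbs_estimators(table)
```

Two identities that follow from (2) make convenient checks: $\sum_j Q_j = r \sum_j S_j / [(n-1)(r-1)]$ for any $r$, and a table that is exactly additive in rows and columns gives $S_j = 0$ for every column.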

Possible use of Grubbs' estimators, and inferences based upon them,
arises frequently. Grubbs was concerned with the variabilities of
observer-used electric clocks as timing instruments, Ehrenberg (1950)
with the precisions of individuals in the sensory scoring of food
samples, and Russell and Bradley (1958) with an application to three
fermentors in a distillery. More recently, Snee (1982) examined
wheat-yield data and multiple-head machine data and suggested that
apparent variance heterogeneity may be an indicator and manifestation of
omitted interaction terms needed in the model. A substantial bibliography
on Grubbs' estimators has developed and was summarized by Maloney (1973).

The use of Grubbs' estimators has been hampered by the difficulties
encountered in developing appropriate inference procedures, although
some results are available. Russell and Bradley gave two procedures.
The first was based on the assumption,

$$A: \sigma_j^2 = \sigma^2, \quad j = 1, \ldots, (r - 1), \tag{4}$$

and provided a test of the hypothesis,

$$H_{01}: \sigma_r^2 = \sigma^2. \tag{5}$$

They showed that

$$F = \Big[r(r - 2)Q_r + \sum_j Q_j\Big] \Big/ \Big[r\Big(\sum_j Q_j - Q_r\Big)\Big] \tag{6}$$

has the central F-distribution with $(n - 1)$ and $(n - 1)(r - 2)$ degrees
of freedom under $A$ and $H_{01}$. In the special case with $r = 3$, they
considered the hypothesis,

$$H_{02}: \sigma_1^2 = \sigma_2^2 = \sigma_3^2, \tag{7}$$

and showed that, under $H_{02}$, the statistic,

$$S = -(n - 1) \log T, \tag{8}$$

where

$$T = 3(Q_1Q_2 + Q_1Q_3 + Q_2Q_3)/(Q_1 + Q_2 + Q_3)^2, \tag{9}$$

has the central chi-square distribution with 2 degrees of freedom
asymptotically with $n$. Shukla (1982) proposed a Bartlett-type
statistic, which, in the notation of this paper, depends on $\log T^*$ where
$T^* = \prod_j S_j / \bar{S}^{\,r}$ with $\bar{S} = \sum_j S_j / r$. He also
approximated the distribution of $\log T^*$. Johnson (1962) specified $j$
and $j'$, $j \ne j'$, and used $S_j/S_{j'}$ to test the hypothesis that
$\sigma_j^2 = \sigma_{j'}^2$, obtaining a distribution that he approximated
with a single F-distribution.

Ellenberg (1977), under the hypothesis that $\sigma_j^2 = \sigma^2$,
$j = 1, \ldots, r$, derived the exact distribution of a subset,
$S_1, \ldots, S_k$, $k \le r$, of $S_1, \ldots, S_r$. A complicated infinite
series expression resulted that he regarded as intractable for further use.

In this article, a direct attempt is made to obtain the joint
distribution of $S_1, \ldots, S_r$ and of $Q_1, \ldots, Q_r$ that is
partially successful. Simple results are obtained when $r = 3$, the most
important case. Then the exact distribution of $T$ in (9) is derived both
under $H_{02}$ in (7) and under the general alternative hypothesis,
$H_{a2}: \sigma_j^2 \ne \sigma^2$ for some $j = 1, 2, 3$. The non-null
distribution of $F$ in (6) is found also. When $r = 4$, the joint
distribution of $S_1, \ldots, S_4$ is obtained under the extension of (7)
for $r = 4$, but this distribution is complicated, although given in closed
form. In general, when $r > 3$, distributional problems become very
difficult; an approximate test procedure for equality of variances is then
suggested.


2.  DISTRIBUTION  THEORY 


We seek the joint distribution of $S_1, \ldots, S_r$, $r \ge 3$. Let $y$ be
the column vector with typical element $y_{ij}$, the elements arranged in
lexicographic order. Define $A_n$ to be the $(n - 1) \times n$ matrix with
the $(n - 1)$ zero-sum rows of the $n$-dimensional Helmert matrix arranged
so that row $s$ has $s$ elements $1/[s(s + 1)]^{1/2}$ followed by the
element $-s/[s(s + 1)]^{1/2}$ and $(n - s - 1)$ zero elements. The usual
$(n - 1)(r - 1)$ error contrasts may be written as the elements of

$$e = (A_n \otimes A_r)\, y, \tag{10}$$

where the typical element of $e$ is $e_{ij}$, $i = 1, \ldots, (n - 1)$,
$j = 1, \ldots, (r - 1)$, arranged in lexicographic order and
$A_n \otimes A_r$ is the direct product of $A_n$ and $A_r$.

It is clear that $E(e) = 0$ and it may be shown that the covariance
matrix of $e$ is $\Sigma_e = I_{n-1} \otimes \Sigma$, where $I_m$ is the
$m$-dimensional identity matrix. The matrix $\Sigma$ is $(r - 1)$-square
symmetric with diagonal elements,

$$\Big(\sum_{j=1}^{s} \sigma_j^2 + s^2 \sigma_{s+1}^2\Big) \Big/ s(s + 1),$$

and $(s, t)$-elements,

$$\Big(\sum_{j=1}^{t} \sigma_j^2 - t\, \sigma_{t+1}^2\Big) \Big/ [t(t + 1)s(s + 1)]^{1/2}, \quad t < s, \quad s = 1, \ldots, (r - 1).$$

The form of $\Sigma_e$ indicates that $e$ may be partitioned into $(n - 1)$
independent $(r - 1)$-dimensional vectors, each having covariance matrix
$\Sigma$, $e' = (e_1', \ldots, e_{n-1}')$. In effect, $e_1, \ldots, e_{n-1}$
may be regarded as $(n - 1)$ independent observation vectors from an
$(r - 1)$-dimensional multivariate normal population with mean vector $0$
and dispersion matrix $\Sigma$.
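Let $V = \sum_{i=1}^{n-1} e_i e_i'$. Then the $r(r - 1)/2$ distinct elements
of $V$ have the Wishart density with $(n - 1)$ degrees of freedom and
dispersion matrix $\Sigma$ so that

$$f(V; \Sigma) = \frac{|V|^{(n-r-1)/2} \exp\left(-\tfrac{1}{2} \operatorname{tr} V\Sigma^{-1}\right)}{2^{(n-1)(r-1)/2}\, \pi^{(r-1)(r-2)/4}\, |\Sigma|^{(n-1)/2} \prod_{i=1}^{r-1} \Gamma\left[\tfrac{1}{2}(n - i)\right]} \tag{11}$$

for $V$ positive definite and zero otherwise, $f$ being used generically to
denote a density function. Furthermore, if $v_{jj'}$ is the typical element
of $V$, algebraic reduction of (3) yields

$$S_j = \frac{j-1}{j}\, v_{j-1,j-1} + \sum_{t=j}^{r-1} \frac{v_{tt}}{t(t+1)} + 2\left[\sum_{j \le t < t' < r} \frac{v_{tt'}}{\{t(t+1)t'(t'+1)\}^{1/2}} - \left\{\frac{j-1}{j}\right\}^{1/2} \sum_{t=j}^{r-1} \frac{v_{j-1,t}}{\{t(t+1)\}^{1/2}}\right], \tag{12}$$

$j = 1, \ldots, r$. The terms in $v_{0,t}$ drop out when $j = 1$ because their
coefficients vanish, and sums over empty ranges are zero.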
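The reduction from the $nr$ observations to the elements of $V$ can be checked numerically. The sketch below is an illustration only, not the authors' code: it builds the Helmert rows described above, forms the contrasts of (10) as the matrix product $A_n Y A_r'$ (equivalent to the direct-product form when $y$ is in lexicographic order), and compares $S_j$ computed from residuals with the quadratic form $a_j' V a_j$, where $a_j$ is column $j$ of $A_r$; expanding that quadratic form gives (12). All function names are hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def helmert_rows(m: int) -> Matrix:
    """(m - 1) x m matrix A_m: row s has s entries 1/c, then -s/c, then zeros."""
    rows = []
    for s in range(1, m):
        c = math.sqrt(s * (s + 1))
        rows.append([1.0 / c] * s + [-s / c] + [0.0] * (m - s - 1))
    return rows


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def s_from_v(v: Matrix, r: int) -> List[float]:
    """S_j = a_j' V a_j, with a_j the j-th column of A_r."""
    cols = transpose(helmert_rows(r))  # r columns, each of length r - 1
    return [
        sum(a[s] * v[s][t] * a[t] for s in range(r - 1) for t in range(r - 1))
        for a in cols
    ]


def s_direct(y: Matrix) -> List[float]:
    """S_j computed from the two-way residuals, as in (3)."""
    n, r = len(y), len(y[0])
    rm = [sum(row) / r for row in y]
    cm = [sum(y[i][j] for i in range(n)) / n for j in range(r)]
    g = sum(rm) / n
    return [sum((y[i][j] - rm[i] - cm[j] + g) ** 2 for i in range(n)) for j in range(r)]


rng = random.Random(0)
n, r = 6, 4
y = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(n)]
e = matmul(matmul(helmert_rows(n), y), transpose(helmert_rows(r)))  # (n-1) x (r-1)
v = matmul(transpose(e), e)                                         # V = sum_i e_i e_i'
max_gap = max(abs(a - b) for a, b in zip(s_from_v(v, r), s_direct(y)))
```

The agreement rests on the fact that $A_m' A_m$ is the centering matrix $I_m - J_m/m$, so the residual table equals $A_n' E A_r$ with $E$ the matrix of contrasts.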

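When $\sigma_j^2 = \sigma^2$, $j = 1, \ldots, r$, $\Sigma = \sigma^2 I_{r-1}$,
and $\Sigma^{-1} = I_{r-1}/\sigma^2$. The statistics $F$, $S$, and $T$ in
(6), (8) and (9) respectively, and any extensions to be considered, are
scale invariant and we may take $\sigma^2 = 1$ without loss of generality
when considering the distributions of such statistics when
$\sigma_j^2 = \sigma^2$, $j = 1, \ldots, r$. Then

$$f(V; I) = \frac{|V|^{(n-r-1)/2} \exp\left(-\tfrac{1}{2} \operatorname{tr} V\right)}{2^{(n-1)(r-1)/2}\, \pi^{(r-1)(r-2)/4} \prod_{i=1}^{r-1} \Gamma\left[\tfrac{1}{2}(n - i)\right]} \tag{13}$$

for $V$ positive definite and zero otherwise.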

In this section, we have reduced the apparent dependence of the
$S_j$, and the $Q_j$, on the $nr$ original observations to dependence on
$\tfrac{1}{2} r(r - 1)$ new variables with known joint distribution (11). It
is clear from (12) that only a non-singular linear transformation is
needed to obtain the joint distribution of $S_1$, $S_2$ and $S_3$ when
$r = 3$. We turn to special cases.

3.  SPECIAL  CASES 


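3.1 Joint Distributions, r = 3: When $r = 3$, the joint distributions
of $S_1$, $S_2$ and $S_3$ and of $Q_1$, $Q_2$ and $Q_3$ follow directly from
(11), (12) and (2). Now

$$\begin{aligned}
S_1 &= (v_{11}/2) + (v_{22}/6) + (v_{12}/\sqrt{3}), \\
S_2 &= (v_{11}/2) + (v_{22}/6) - (v_{12}/\sqrt{3}), \\
S_3 &= 2v_{22}/3,
\end{aligned} \tag{14}$$

$$V = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix}
\quad \text{and} \quad
\Sigma = \begin{pmatrix}
(\sigma_1^2 + \sigma_2^2)/2 & (\sigma_1^2 - \sigma_2^2)/\sqrt{12} \\
(\sigma_1^2 - \sigma_2^2)/\sqrt{12} & (\sigma_1^2 + \sigma_2^2 + 4\sigma_3^2)/6
\end{pmatrix}.$$

The linear transformation (14) lets us write

$$f(S_1, S_2, S_3; \Sigma) = \frac{3^{n-1} \left[\left(\sum_j S_j\right)^2 - 2\sum_j S_j^2\right]^{(n-4)/2}}{2^{n-1}\, \pi\, \Gamma(n - 2) \left(\sum_{j<j'} \sigma_j^2 \sigma_{j'}^2\right)^{(n-1)/2}} \exp\left[-\frac{\sum_j \left(2\sum_t \sigma_t^2 - 3\sigma_j^2\right) S_j}{2 \sum_{j<j'} \sigma_j^2 \sigma_{j'}^2}\right], \tag{15}$$

where $0 < S_j < 2\sum_{j'} S_{j'}/3$, $j = 1, 2, 3$, and
$\left(\sum_j S_j\right)^2 - 2\sum_j S_j^2 > 0$. From (2), the density of
$Q_1$, $Q_2$ and $Q_3$ is

$$f(Q_1, Q_2, Q_3; \Sigma) = \frac{(n - 1)^{n-1}\, (Q_1Q_2 + Q_1Q_3 + Q_2Q_3)^{(n-4)/2}}{4\pi\, \Gamma(n - 2) \left(\sum_{j<j'} \sigma_j^2 \sigma_{j'}^2\right)^{(n-1)/2}} \exp\left[-\frac{(n - 1) \sum_j Q_j \sum_{t \ne j} \sigma_t^2}{2 \sum_{j<j'} \sigma_j^2 \sigma_{j'}^2}\right], \tag{16}$$

where $Q_1 + Q_2 + Q_3 > 0$ and $Q_1Q_2 + Q_1Q_3 + Q_2Q_3 > 0$. Note that one
$Q_j$ may be negative but not two.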

3.2 Russell-Bradley Test, r = 3: The non-null distribution of $F$ in
(6) may be found from (16) when $\sigma_1^2 = \sigma_2^2 = \sigma^2$,
$\sigma_3^2 \ne \sigma^2$. Now $F = [1 + 4Q_3/(Q_1 + Q_2)]/3$ and its
density is

$$f(F; \gamma) = \gamma^{\frac{1}{2}(n-1)} F^{\frac{1}{2}(n-3)} (\gamma + F)^{-(n-1)} \big/ B\left[\tfrac{1}{2}(n - 1), \tfrac{1}{2}(n - 1)\right], \tag{17}$$

where $F > 0$, $\gamma = (\sigma^2 + 2\sigma_3^2)/3\sigma^2$. Thus, for this
test, $F/\gamma$ has the central F-distribution with $(n - 1)$ and $(n - 1)$
degrees of freedom under both $H_{01}$ in (5), for which $\gamma = 1$, and
$H_{a1}: \sigma_3^2 \ne \sigma^2$ when $r = 3$.

Details on this result are given by Brindley (1982). The
transformation, $x = Q_1 + Q_2$, $y = Q_1/(Q_1 + Q_2)$,
$z = Q_3/(Q_1 + Q_2)$, is used to obtain (17) from (16) since the marginal
distribution of $z$ may be obtained and $F$ is directly dependent on $z$.

To illustrate the results of this subsection, suppose that a
one-sided alternative hypothesis is considered, say
$\sigma_3^2 > \sigma^2$. Let $F^*$ be a random variable with the central
F-distribution with $(n - 1)$ and $(n - 1)$ degrees of freedom and let
$F_\alpha$ be such that $P(F^* > F_\alpha) = \alpha$. If $F$ is computed
from (6) with $r = 3$, $H_{01}$ in (5), given $A$ in (4), is rejected at
significance level $\alpha$ if $F > F_\alpha$. The power of the test
depends on $\gamma$, specified when $\sigma_3^2/\sigma^2$ is specified.
The power of the test for specified $\gamma$ is evaluated simply as
$P(F^* > F_\alpha/\gamma)$. The complementary one-sided alternative
hypothesis or the two-sided alternative hypothesis,
$\sigma_3^2 \ne \sigma^2$, are considered in similar ways.
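The power calculation $P(F^* > F_\alpha/\gamma)$ can be sketched with the Python standard library alone. The code below is an illustration under stated assumptions, not the authors' procedure: it uses the fact that $F^*/(1 + F^*)$ has a symmetric beta distribution when both degrees of freedom equal $m = n - 1$, evaluates that beta integral by Simpson's rule (adequate for $n \ge 3$), and finds $F_\alpha$ by bisection. All function names are hypothetical.

```python
import math


def beta_cdf_symmetric(u: float, a: float, steps: int = 2000) -> float:
    """I_u(a, a) by Simpson's rule; assumes a >= 1 so the integrand is bounded."""
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    log_b = 2.0 * math.lgamma(a) - math.lgamma(2.0 * a)  # log B(a, a)

    def pdf(x: float) -> float:
        if x <= 0.0 or x >= 1.0:
            return 0.0 if a > 1.0 else math.exp(-log_b)
        return math.exp((a - 1.0) * (math.log(x) + math.log1p(-x)) - log_b)

    h = u / steps
    total = pdf(0.0) + pdf(u)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return total * h / 3.0


def f_upper_tail(c: float, m: int) -> float:
    """P(F* > c) for F* central F with (m, m) degrees of freedom."""
    return 1.0 - beta_cdf_symmetric(c / (1.0 + c), m / 2.0)


def f_upper_point(alpha: float, m: int) -> float:
    """F_alpha with P(F* > F_alpha) = alpha, by bisection on u = F/(1 + F)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if 1.0 - beta_cdf_symmetric(mid, m / 2.0) > alpha:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return u / (1.0 - u)


def power_one_sided(alpha: float, n: int, variance_ratio: float) -> float:
    """Power against sigma_3^2 / sigma^2 = variance_ratio > 1 when r = 3."""
    gamma = (1.0 + 2.0 * variance_ratio) / 3.0  # gamma as defined after (17)
    m = n - 1
    return f_upper_tail(f_upper_point(alpha, m) / gamma, m)
```

For example, `power_one_sided(0.05, 11, 4.0)` gives the power of the 0.05-level test with $n = 11$ rows when the third column variance is four times the common variance of the other two; with `variance_ratio = 1.0` the function returns the level $\alpha$ itself.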


3.3 Variance Homogeneity Test, r = 3: The distributions of $T$ in (9)
and of $S$ in (8) may be obtained when $r = 3$ under both $H_{02}$ in (7)
and the alternative hypothesis,
$H_{a2}: \sigma_j^2 \ne \sigma_{j'}^2$ for some $j \ne j'$,
$j, j' = 1, 2, 3$. Indeed, under $H_{a2}$, the distribution of $T$ is

$$f(T; \lambda) = \tfrac{1}{2}(n - 2)\, \lambda^{\frac{1}{2}(n-1)}\, T^{\frac{1}{2}(n-4)}\, G\left[\tfrac{1}{2}(n - 1),\, n/2,\, 1,\, (1 - T)(1 - \lambda)\right], \tag{18}$$

where $T > 0$,

$$\lambda = 3 \sum_{j<j'} \sigma_j^2 \sigma_{j'}^2 \Big/ \Big(\sum_j \sigma_j^2\Big)^2, \tag{19}$$

and $G$ is the usual hypergeometric function. When $H_{02}$ is true,
$\lambda = 1$ and

$$f(T; 1) = \tfrac{1}{2}(n - 2)\, T^{\frac{1}{2}(n-4)}, \quad T > 0, \tag{20}$$

a very simple density. Furthermore, it follows at once that the density
of $(n - 2)S/(n - 1)$ is the central chi-square distribution with 2 degrees
of freedom, a result differing from the approximate one of Russell and
Bradley (1958) only by the factor $(n - 2)/(n - 1)$.

The test of the hypothesis $H_{02}$ may be based on $(n - 2)S/(n - 1)$
and the chi-square distribution, with large values of the statistic
significant. The test may also be done simply in terms of $T$, with
small values of $T$ significant. Indeed, under $H_{02}$ with $r = 3$,
$P(T \le \alpha^{2/(n-2)}) = \alpha$. The power of the $\alpha$-level test is

$$P(T \le \alpha^{2/(n-2)} \mid \lambda) = \int_0^{\alpha^{2/(n-2)}} f(T; \lambda)\, dT$$

for specified $\lambda$ with $f(T; \lambda)$ given in (18).
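The power integral above has no closed form in general, but it is easy to evaluate numerically. The following sketch is illustrative only and makes several assumptions not found in the paper: the hypergeometric function $G(a, b, 1, z)$ is summed from its Gauss series (valid here because $0 \le z < 1$ when $0 < \lambda \le 1$), the integral is done by Simpson's rule, and $n \ge 4$ is required so that the integrand stays bounded at $T = 0$. Function names are hypothetical.

```python
from typing import Sequence


def lam_from_variances(var: Sequence[float]) -> float:
    """lambda in (19) for three column variances."""
    s = sum(var)
    pair_sum = (s * s - sum(v * v for v in var)) / 2.0  # sum over j < j'
    return 3.0 * pair_sum / (s * s)


def hyp2f1_c1(a: float, b: float, z: float) -> float:
    """Gauss series for G(a, b, 1, z), assuming 0 <= z < 1 and a, b > 0."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total and k < 100000:
        term *= (a + k) * (b + k) * z / ((k + 1.0) ** 2)
        total += term
        k += 1
    return total


def t_density(t: float, n: int, lam: float) -> float:
    """Density (18) of T for r = 3; it reduces to (20) when lam = 1."""
    g = hyp2f1_c1(0.5 * (n - 1), 0.5 * n, (1.0 - t) * (1.0 - lam))
    return 0.5 * (n - 2) * lam ** (0.5 * (n - 1)) * t ** (0.5 * (n - 4)) * g


def t_cdf(x: float, n: int, lam: float, steps: int = 2000) -> float:
    """P(T <= x | lam) by Simpson's rule; assumes n >= 4 and 0 < lam <= 1."""
    h = x / steps
    total = t_density(0.0, n, lam) + t_density(x, n, lam)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, n, lam)
    return total * h / 3.0


def power(alpha: float, n: int, lam: float) -> float:
    """Power of the alpha-level test that rejects when T <= alpha^(2/(n-2))."""
    return t_cdf(alpha ** (2.0 / (n - 2)), n, lam)
```

As a usage example, column variances in the ratio 1 : 1 : 4 give `lam_from_variances((1.0, 1.0, 4.0)) == 0.75`, and `power(0.05, 10, 0.75)` then evaluates the integral for $n = 10$. Setting `lam = 1.0` recovers the level $\alpha$.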
3.4 Other Tests, r = 3: The joint densities (15) and (16) are
relatively simple when $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma^2$,
$\sigma^2 = 1$ without loss of generality. Then we have

$$f(S_1, S_2, S_3) = \frac{3^{(n-1)/2}}{2^{n-1}\, \pi\, \Gamma(n - 2)} \Big[\Big(\sum_j S_j\Big)^2 - 2\sum_j S_j^2\Big]^{(n-4)/2} e^{-\sum_j S_j/2}, \tag{21}$$

where $0 < S_j < 2\sum_{j'} S_{j'}/3$, $j = 1, 2, 3$, and
$\left(\sum_j S_j\right)^2 - 2\sum_j S_j^2 > 0$, and

$$f(Q_1, Q_2, Q_3) = \frac{(n - 1)^{n-1}}{4 \cdot 3^{(n-1)/2}\, \pi\, \Gamma(n - 2)}\, (Q_1Q_2 + Q_1Q_3 + Q_2Q_3)^{(n-4)/2}\, e^{-(n-1)\sum_j Q_j/3}, \tag{22}$$

where $\sum_j Q_j > 0$ and $Q_1Q_2 + Q_1Q_3 + Q_2Q_3 > 0$.

We have considered use of (21) to investigate the distribution of
Shukla's statistic $T^*$ for a test of variance homogeneity. The
distribution of $T^*$ is very intractable even when $r = 3$ and its use
cannot be recommended in view of the simple results obtained above for
$T$ when $r = 3$. Also, use of (21) to rederive the distribution of
Johnson's statistic, $S_j/S_{j'}$, $j \ne j'$, leads to no new insights or
special simplifications.


3.5 A Joint Distribution, r = 4: The joint distribution of the elements
of $V$ is given in (13), the number of variates on which
$S_1, \ldots, S_r$ depend being $\tfrac{1}{2} r(r - 1)$. We have been able
to proceed to obtain the joint distribution of $2r - 3$ variates on which
$S_1, \ldots, S_r$ depend, see Brindley (1982), but the joint distribution
of $S_1, \ldots, S_r$ must be very complicated for general $r$ and $n$.

The very special case with $r = 4$ and $n = 4$ has been completed.
Then

$$f(S_1, S_2, S_3, S_4) = 2^{-1/2}\, \pi^{-3/2} \min_{1 \le j \le 4} \{S_j^{1/2}\}\; e^{-\frac{1}{2} \sum_j S_j}$$

or

$$f(S_1, S_2, S_3, S_4) = (2\pi)^{-3/2} \Big(\sum_j S_j^{1/2} - 2 \max_{1 \le j \le 4} \{S_j^{1/2}\}\Big)\, e^{-\frac{1}{2} \sum_j S_j}$$

as $\sum_j S_j^2 - 2\sum_{j<j'} S_j S_{j'}$ is less than
$-8\left(\prod_j S_j\right)^{1/2}$ or between
$-8\left(\prod_j S_j\right)^{1/2}$ and $8\left(\prod_j S_j\right)^{1/2}$
respectively, $S_j > 0$, $j = 1, \ldots, 4$, and
$f(S_1, S_2, S_3, S_4)$ is zero otherwise.

It appears that the case with $r = 3$, an important case, may be the
only situation for which simple distributions for test statistics may
be found.


4.  THE  USE  OF  CHARACTERISTIC  FUNCTIONS 

Ellenberg (1977) used the method of characteristic functions to
obtain the joint distribution of a subset of $k$ of $S_1, \ldots, S_r$,
$k \le r$, under the assumption that each $\sigma_j^2 = 1$. We summarize
for $k = r$ since coefficients in the series that he gives to represent
the distribution are difficult to determine.

Let $\phi(t)$ represent the characteristic function of
$S_1, \ldots, S_r$, where $t = (t_1, \ldots, t_r)$. From Ellenberg's
formula (3.2) for $\phi(t)$, after algebraic reduction,

$$\phi(t) = \prod_j (1 - 2it_j)^{-(n-1)/2} \Big\{\frac{1}{r} \sum_j (1 - 2it_j)^{-1}\Big\}^{-(n-1)/2}, \tag{23}$$

where $i = \sqrt{-1}$ in (23). Since $(1 - 2it)^{-k/2}$ is the
characteristic function of a chi-square variate with $k$ degrees of
freedom, it is appropriate to expand $\phi(t)$ in negative powers of the
$(1 - 2it_j)$. The result is that

$$\phi(t) = \sum_{k=0}^{\infty} \frac{\Gamma\left(k + \frac{n-1}{2}\right)}{k!\, \Gamma\left(\frac{n-1}{2}\right)} \sum_{s=0}^{k} \binom{k}{s} \Big(-\frac{1}{r}\Big)^{s} \sum_{\sum_a j_a = s} \frac{s!}{\prod_a j_a!} \prod_a (1 - 2it_a)^{-\left(j_a + \frac{n-1}{2}\right)}$$

and

$$f(S_1, \ldots, S_r) = \sum_{k=0}^{\infty} \frac{\Gamma\left(k + \frac{n-1}{2}\right)}{k!\, \Gamma\left(\frac{n-1}{2}\right)} \sum_{s=0}^{k} \binom{k}{s} \Big(-\frac{1}{r}\Big)^{s} \sum_{\sum_a j_a = s} \frac{s!}{\prod_a j_a!} \prod_a \frac{S_a^{\,j_a + \frac{n-1}{2} - 1}}{2^{\,j_a + \frac{n-1}{2}}\, \Gamma\left(j_a + \frac{n-1}{2}\right)} \exp\Big(-\frac{1}{2} \sum_j S_j\Big), \tag{24}$$

where $a$ has the range $1, \ldots, r$ and $S_j > 0$, $j = 1, \ldots, r$, in
(24).

It is far from apparent that (21) and (24) are identical when
$r = 3$. Note that $\left(\sum_j S_j\right)^2 - 2\sum_j S_j^2 > 0$ for (21)
but this requirement does not accompany (24). Two verifications have been
made.

The moment generating function,
$M_Q(\theta) = E\{\exp[(n - 1)\sum_j \theta_j Q_j/3]\}$, may be found from
(22). This is done through use of the transformation, $u = Q_1$,
$v = Q_1 + Q_2$, $w = Q_3 + Q_1Q_2/(Q_1 + Q_2)$, with integration with
respect to $w$, $u$, and $v$ in turn. The result is that

$$M_Q(\theta) = \Big\{1 - \tfrac{2}{3} \sum_j \theta_j + \tfrac{2}{3}(\theta_1\theta_2 + \theta_1\theta_3 + \theta_2\theta_3) - \tfrac{1}{3}(\theta_1^2 + \theta_2^2 + \theta_3^2)\Big\}^{-(n-1)/2}.$$

Since, from (2) with $r = 3$, $S_1 = (n - 1)(4Q_1 + Q_2 + Q_3)/9$,
$S_2 = (n - 1)(Q_1 + 4Q_2 + Q_3)/9$ and
$S_3 = (n - 1)(Q_1 + Q_2 + 4Q_3)/9$, the moment generating function of
$S_1$, $S_2$ and $S_3$ is

$$M_S(t) = E\Big(\exp \sum_j t_j S_j\Big) = \Big\{1 - \tfrac{4}{3} \sum_j t_j + \tfrac{4}{3}(t_1t_2 + t_1t_3 + t_2t_3)\Big\}^{-(n-1)/2},$$

corresponding to the characteristic function (23) for $r = 3$ from which
(24) was derived.

In the very special case with $r = 3$ and $n = 4$, we have been able
to sum the series in (24) to obtain the special form of (21). The
method used seems awkward but an easier way has not been found. Details
are given by Brindley (1982).


5.  SIMULATIONS 



Even though we have been unsuccessful in obtaining the joint
distribution of $Q_1, \ldots, Q_r$ in usable form for $r > 3$, a test of
variance homogeneity is needed. The statistic $T$ in (9) yielded a simple
distribution when $r = 3$ and has intuitive appeal, particularly when
written in the alternative form,

$$T = 1 - \frac{(Q_1 - Q_2)^2 + (Q_1 - Q_3)^2 + (Q_2 - Q_3)^2}{2(Q_1 + Q_2 + Q_3)^2}.$$

We consider the statistic,

$$X_r = -(n - 1) \log T_r, \tag{25}$$

where

$$T_r = 2r \sum_{j<j'} Q_j Q_{j'} \Big/ (r - 1)\Big(\sum_j Q_j\Big)^2. \tag{26}$$

Note that $T_3 = T$ and that $T_r = 1$ in the event that all $Q_j$ are
equal. Choice of the multiplier $(n - 1)$ in (25) was made after empirical
investigation.

Let us suppose that, when $\sigma_1^2 = \cdots = \sigma_r^2$, $X_r/a_r$ has
the central chi-square distribution with $\nu_r$ degrees of freedom. If the
experiment is simulated a sufficient number of times, $a_r$ and $\nu_r$ may
be evaluated.
Two series of simulations have been conducted, the first to
determine $a_r$ and $\nu_r$ and the second to confirm and to extend the
approximate procedure proposed to larger values of $r$. The method
referenced in the Statistical Analysis Systems (SAS) user's guide was used
to generate the required normal observations; properties of this
generator are discussed by Lewis, Goodman and Miller (1969). In each
simulation, $nr$ independent observations on a standard normal variate
were produced, grouped appropriately into $n$ rows and $r$ columns, and the
value of $X_r$ computed. For each $r$ and $n$, 10,000 simulations were
employed to compute the first and second moments of $X_r$ from which $a_r$
and $\nu_r$ were estimated. The first series of simulations was done for
$r = 4$, 5 and 6, $n = 10$, 15, 20, 30 and 50. Results are given in
Table I. Included also in Table I, for comparison, are simulation results
for $r = 3$ and the statistic $(n - 2)S/(n - 1) = (n - 2)X_3/(n - 1)$, $S$
in (8), since this statistic is now known to have the chi-square
distribution with 2 degrees of freedom.
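The simulation scheme just described can be sketched as follows. This is a hedged illustration, not a reproduction of the original SAS runs: it uses Python's `random.gauss` rather than the generator cited above, it matches moments through $E(X_r) = a_r \nu_r$ and $\operatorname{Var}(X_r) = 2a_r^2 \nu_r$ (which is one natural reading of "first and second moments"), and it skips any draw with $T_r \le 0$, a case the paper does not discuss. The function names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def x_statistic(y: List[List[float]]) -> float:
    """X_r = -(n - 1) log T_r, from (25) and (26), for an n x r table."""
    n, r = len(y), len(y[0])
    rm = [sum(row) / r for row in y]
    cm = [sum(y[i][j] for i in range(n)) / n for j in range(r)]
    g = sum(rm) / n
    s = [sum((y[i][j] - rm[i] - cm[j] + g) ** 2 for i in range(n)) for j in range(r)]
    tot = sum(s)
    q = [r * sj / ((n - 1) * (r - 2)) - tot / ((n - 1) * (r - 1) * (r - 2)) for sj in s]
    pair_sum = (sum(q) ** 2 - sum(v * v for v in q)) / 2.0  # sum over j < j'
    t_r = 2.0 * r * pair_sum / ((r - 1) * sum(q) ** 2)
    if t_r <= 0.0:
        raise ValueError("T_r is not positive, so log T_r is undefined")
    return -(n - 1) * math.log(t_r)


def estimate_a_nu(n: int, r: int, reps: int, seed: int = 1) -> Tuple[float, float]:
    """Moment estimates (a_r, nu_r) under equal column variances."""
    rng = random.Random(seed)
    xs = []
    for _ in range(reps):
        y = [[rng.gauss(0.0, 1.0) for _ in range(r)] for _ in range(n)]
        try:
            xs.append(x_statistic(y))
        except ValueError:
            continue  # assumption: draws with T_r <= 0 are simply skipped
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    # If X/a is chi-square with nu df, then E(X) = a*nu and Var(X) = 2*a^2*nu.
    return var / (2.0 * mean), 2.0 * mean * mean / var


a_hat, nu_hat = estimate_a_nu(n=10, r=4, reps=2000)
```

For $r = 3$ the code works with the unscaled $X_3$, so the fitted multiplier should sit near $(n - 1)/(n - 2)$ rather than near 1, in line with the exact result of Section 3.3.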

The results shown in Table I are very interesting. It is apparent
that $\nu_r$ is very close to $(r - 1)$. Closer examination suggests that
$a_r$ is very close to $2/[(r - 1)(r - 2)]$. Accordingly, we suggest use of
the statistic,

$$Y_r = -\tfrac{1}{2}(n - 1)(r - 1)(r - 2) \log T_r, \tag{27}$$

for the variance homogeneity test, $Y_r$ to be taken to have the central
chi-square distribution with $(r - 1)$ degrees of freedom. Percentage
points of the simulated distribution of $Y_r$ are compared with those of
chi square in Table II for $r = 3, \ldots, 6$, 8, 10, and 12 for values of
$n$ shown.

TABLE I

Simulation Results to Determine $a_r$ and $\nu_r$

     r       n       a_r      nu_r

     3      10      1.007     1.957
            15      1.039     1.922
            20      1.006     1.982
            30      0.954     2.075
            50      0.972     2.031

     4      10      0.332     2.994
            15      0.330     3.027
            20      0.327     3.023
            30      0.341     2.921
            50      0.332     2.990

     5      10      0.168     3.871
            15      0.164     3.924
            20      0.162     3.985
            30      0.163     3.993
            50      0.162     3.955

     6      10      0.103     4.687
            15      0.099     4.903
            20      0.097     4.973
            30      0.098     4.946
            50      0.097     4.973

The chi-square approximation to the distribution of $Y_r$ seems
remarkably good. In general, the approximation is very slightly
conservative except for $n = 10$ at the .01-level, but even here the
agreement is very good. The approximate method proposed seems to
effectively provide the desired test of variance homogeneity based on
Grubbs' estimators for practical purposes.

6.  BRIEF  EXAMPLES 

We examine the variance homogeneity test for two examples in the
literature. The first example has $r = 3$ and the exact test of Section 3.3
may be used. The second example has $r = 4$ and the approximate method of
Section 5 may be used. In both examples, the null hypothesis is
$H_0: \sigma_j^2 = \sigma^2$, $j = 1, \ldots, r$, and the alternative
hypothesis is $H_a: \sigma_j^2 \ne \sigma_{j'}^2$ for some $j \ne j'$.

Russell and Bradley (1958) provided data on alcohol yields for
three fermentors for $n = 38$ days in a distillery. Day effects were
judged to be important so Grubbs' estimators were used. They calculated
$Q_1 = 0.001537$, $Q_2 = 0.001722$ and $Q_3 = 0.000041$. The test statistic
$T$ in (9) has the value 0.7659. From (20),
$P(T \le T_0) = T_0^{(n-2)/2}$ and the P-value for the experiment is
$(0.7659)^{18} = 0.0082$, indicating significantly different variances for
the fermentors at the 0.01-level of significance.
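The arithmetic of this first example is short enough to reproduce directly. The sketch below is only an illustration (the function names are hypothetical); it takes the three $Q_j$ values and $n = 38$ quoted above and evaluates $T$ from (9) and the exact P-value from (20).

```python
import math
from typing import Sequence


def t_statistic(q: Sequence[float]) -> float:
    """T in (9) for r = 3."""
    q1, q2, q3 = q
    return 3.0 * (q1 * q2 + q1 * q3 + q2 * q3) / (q1 + q2 + q3) ** 2


def t_critical(alpha: float, n: int) -> float:
    """Reject H02 at level alpha when T <= alpha^(2/(n-2)); requires n > 2."""
    return alpha ** (2.0 / (n - 2))


def t_p_value(t: float, n: int) -> float:
    """P(T <= t) under H02, obtained by integrating (20) from 0 to t."""
    return t ** ((n - 2) / 2.0)


def chi_square_form(t: float, n: int) -> float:
    """(n - 2) S / (n - 1) = -(n - 2) log T, chi-square with 2 df under H02."""
    return -(n - 2) * math.log(t)


# Values quoted in the text for the fermentor data.
q = (0.001537, 0.001722, 0.000041)
n = 38
t = t_statistic(q)
p = t_p_value(t, n)
reject_at_01 = t <= t_critical(0.01, n)
```

The two forms of the test agree by construction: the upper tail of a chi-square variate with 2 degrees of freedom at `chi_square_form(t, n)` is `exp(-chi_square_form(t, n) / 2)`, which equals `t_p_value(t, n)`.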
Graybill (1954) gave the yields of $r = 4$ varieties of wheat at
$n = 13$ locations in Iowa. It was suspected that error variances differed
by varieties and that there were location effects. Calculation yields
$Q_1 = 875.40$, $Q_2 = -84.92$, $Q_3 = 451.47$ and $Q_4 = 109.32$. Now
$T_4 = 0.6109$ in (26) and $Y_4$ in (27) has the value 17.74, highly
significant when compared with significance levels of chi-square with 3
degrees of freedom. These data were considered also by Ellenberg (1977),
Snee (1982), and others.
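The second example can be reproduced the same way. The sketch below is illustrative only; the function names are hypothetical, and the chi-square tail probability is computed from the standard finite-sum expressions for integer degrees of freedom rather than read from tables as in the paper.

```python
import math
from typing import Sequence


def t_r(q: Sequence[float]) -> float:
    """T_r in (26)."""
    r = len(q)
    total = sum(q)
    pair_sum = (total * total - sum(v * v for v in q)) / 2.0  # sum over j < j'
    return 2.0 * r * pair_sum / ((r - 1) * total * total)


def y_r(q: Sequence[float], n: int) -> float:
    """Y_r in (27), referred to chi-square with r - 1 degrees of freedom."""
    r = len(q)
    return -0.5 * (n - 1) * (r - 1) * (r - 2) * math.log(t_r(q))


def chi2_upper_tail(x: float, df: int) -> float:
    """P(chi-square_df > x) for a positive integer df."""
    h = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= h / i
            total += term
        return math.exp(-h) * total
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-h) * term
        term *= h / (i + 0.5)
    return total


# Values quoted in the text for the wheat data, n = 13 locations.
q = (875.40, -84.92, 451.47, 109.32)
n = 13
t4 = t_r(q)
y4 = y_r(q, n)
p_approx = chi2_upper_tail(y4, len(q) - 1)
```

Note that the negative estimate $Q_2$ causes no difficulty here, since $T_4$ remains positive.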


7.  REMARKS 

Small-sample distribution theory based on Grubbs' estimators
appears to be very difficult in general, but surprisingly simple when
$r = 3$. Statistical methods are most needed for smaller values of $r$.

We have provided the necessary theory when $r = 3$ and a good
approximate test for variance homogeneity when $r > 3$. Some further
investigation of the approximate test for small values of $n$ may be
desirable.

One warning should be issued. Tests on variances seem to be more
sensitive to departures from the assumptions of normality than tests
on means. This may be the case also for tests based on Grubbs'
estimators and some investigation of the effects of nonnormality is
suggested.